Random mappings designed for commercial search engines
Abstract
We give a practical random mapping that takes any set of documents represented as vectors in Euclidean space and then maps them to a sparse subset of the Hamming cube while retaining ordering of inter-vector inner products. Once represented in the sparse space, it is natural to index documents using commercial text-based search engines which are specialized to take advantage of this sparse and discrete structure for large-scale document retrieval. We give a theoretical analysis of the mapping scheme, characterizing exact asymptotic behavior and also giving non-asymptotic bounds which we verify through numerical simulations. We balance the theoretical treatment with several practical considerations; these allow substantial speed up of the method. We further illustrate the use of this method on search over two real data sets: a corpus of images represented by their color histograms, and a corpus of daily stock market index values.
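To make the idea of a sparse binary mapping concrete, here is a minimal sketch. It is not the mapping from the paper, whose construction the abstract does not specify. It assumes a swapped-in technique: a Gaussian random projection followed by keeping the indices of the `k` largest projected values. All names (`make_projection`, `sparse_code`, `overlap`) and the parameters `d`, `m`, `k` are hypothetical. Because the top-`k` selection ignores vector length, the overlap between two codes tracks the angle between the vectors, so it reflects inner-product ordering only for vectors of comparable norm.

```python
import heapq
import random


def make_projection(d, m, seed=0):
    """Return an m x d matrix of i.i.d. standard Gaussian entries (illustrative choice)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(m)]


def sparse_code(x, projection, k):
    """Map vector x to a sparse point of the Hamming cube {0,1}^m.

    The code is stored as the set of its k active coordinates: the indices
    of the k largest entries of the projected vector.
    """
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in projection]
    top = heapq.nlargest(k, range(len(scores)), key=scores.__getitem__)
    return frozenset(top)


def overlap(a, b):
    """Inner product of two binary codes, i.e. the number of shared active bits."""
    return len(a & b)


# Hypothetical sizes: 20-dimensional inputs, 2000-bit codes with 100 active bits.
d, m, k = 20, 2000, 100
P = make_projection(d, m)

rng = random.Random(1)
query = [rng.gauss(0.0, 1.0) for _ in range(d)]
near = [q + 0.1 * rng.gauss(0.0, 1.0) for q in query]  # small perturbation of query
far = [rng.gauss(0.0, 1.0) for _ in range(d)]          # unrelated vector

code_q, code_near, code_far = (sparse_code(v, P, k) for v in (query, near, far))
print(overlap(code_q, code_near), overlap(code_q, code_far))
```

Under these assumptions, each active index could be written out as a token (for example `t17 t342 ...`) so that a text search engine's term matching counts the shared bits between a query code and each indexed document.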
Related resources
Context-Aware Online Commercial Intention Detection
With more and more commercial activities moving onto the Internet, people tend to purchase what they need through the Internet or conduct some online research before the actual transactions happen. For many Web users, their online commercial activities start from submitting a search query to search engines. Just like common Web search queries, the queries with commercial intention are usually v...
The Random Neural Network Applied to an Intelligent Search Assistant
Users cannot guarantee that the results they obtain from Web search engines are exhaustive, or that they actually respond to their needs. Search results are influenced by the users’ own ambiguity in formulating their requests or queries as well as by the commercial interest of Web search engines and Internet users that want to reach a wider audience. This paper presents an Intelligent Search Assist...
Discovering Popular Clicks' Pattern of Teen Users for Query Recommendation
Search engines are still the most important gateways for information search on the internet. In this regard, providing the best response to the user's request in the shortest possible time is still desired. Normally, search engines are designed for adults, and few policies have been employed that consider teen users. Teen users are more biased in clicking the results list than are adult users. This leads...
A search quality evaluation based on objective-subjective method
Commercial search engines, especially meta-search engines, were designed to retrieve information by submitting users’ queries to multiple conventional search engines and integrating the partial search results generated by each search engine. How to find the best search engine for user queries is known as the selection problem of search engines. Recently, the selection problem has become a...
Design Alternatives for Large-Scale Web Search:
Indexing the Web and meeting the throughput, response-time, and failure-resilience requirements of a search engine requires massive storage and computational resources and a careful system design for scalability. This is exemplified by the big data centers of the leading commercial search engines. Various proposals and debates have appeared in the literature as to whether Web indexes can be impl...
Journal title:
- CoRR
Volume abs/1507.05929
Publication date 2015